Testing For Exponentiality and Uniformity


by 
and

Michael  A.  Stephens 


1.  Introduction. 

Let $\mathrm{Exp}(\alpha,\beta)$ denote the distribution
$F(x) = 1-\exp\{-(x-\alpha)/\beta\}$, $x > \alpha$, where $\alpha$ and
$\beta$ are constants and $\beta$ is positive. Suppose
$X_1,\ldots,X_n$ is a random sample from $\mathrm{Exp}(0,\beta)$; the
$X_i$ could denote the time intervals between events at times $T_i$ in
a Poisson process, so that $T_i = \sum_{j=1}^{i} X_j$, $i = 1,\ldots,n$.
It is well known that the values $U_{(j)} = T_j/T_n$,
$j = 1,\ldots,n-1$, are then the order statistics of a sample of size
$n-1$ from a uniform distribution with limits 0 and 1, written
$U(0,1)$. The $n$ spacings between the $U_{(j)}$ are then defined by
$D_i = U_{(i)}-U_{(i-1)}$, $i = 1,\ldots,n$, with $U_{(0)} \equiv 0$ and
$U_{(n)} \equiv 1$. In the present context, $D_i = X_i/T_n$,
$i = 1,\ldots,n$.
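The reduction from exponential intervals to uniform spacings can be illustrated numerically. The following is a minimal sketch using only the Python standard library; the function name `spacings_from_intervals`, the seed, and the scale value 2.0 are illustrative assumptions and do not come from the paper.

```python
import random
from itertools import accumulate

def spacings_from_intervals(x):
    """Return (u, d): the n-1 ordered uniforms T_j/T_n and the n spacings D_i."""
    t = list(accumulate(x))              # event times T_1, ..., T_n
    total = t[-1]                        # T_n
    u = [tj / total for tj in t[:-1]]    # U_(j) = T_j / T_n, j = 1, ..., n-1
    padded = [0.0] + u + [1.0]           # add U_(0) = 0 and U_(n) = 1
    d = [b - a for a, b in zip(padded, padded[1:])]
    return u, d

random.seed(0)                           # arbitrary seed, for repeatability only
beta = 2.0                               # illustrative scale parameter
x = [random.expovariate(1.0 / beta) for _ in range(8)]
u, d = spacings_from_intervals(x)
```

Up to rounding error, each `d[i]` equals `x[i] / sum(x)`, which is the identity $D_i = X_i/T_n$ stated above.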

Suppose now that $X_i$, $i = 1,\ldots,n$, is a random sample from a
distribution $F_0(x)$, and it is desired to test either
$H_0$: $F_0(x)$ is $\mathrm{Exp}(0,\beta)$ with $\beta$ unknown, or the
more general hypothesis $H_0^*$: $F_0(x)$ is
$\mathrm{Exp}(\alpha,\beta)$ with $\alpha$ and $\beta$ unknown. Many
tests have been proposed for $H_0$ and $H_0^*$, some of them based on
the reduction to the uniform distribution given above. Another
technique is to plot the order statistics $X_{(i)}$ against $m_i$, the
expected values of the order statistics of a sample from
$\mathrm{Exp}(0,1)$; test statistics can then be based on properties of
the regression line calculated by Generalised Least Squares (since the
$X_{(i)}$ are correlated). From these two very different approaches
have emerged, for example, Greenwood's statistic based on the spacings
$D_i$, and several regression statistics. In this article we show that
some of the regression statistics are algebraically related to
Greenwood's, so that the tests based on them are equivalent; also that
the distribution of the Shapiro-Wilk statistic for exponentiality,
$W_E$, is related to that of Greenwood's statistic for uniformity, so
that percentage points are algebraically connected.

2.  The  Statistics. 

2.1 The Greenwood spacings statistic. This statistic is usually defined
for a sample $U_1,\ldots,U_n$ distributed between zero and one, and is
then

$$G(n) = \sum_{i=1}^{n+1} D_i^2 ,$$

where the $D_i$ are defined by $D_i = U_{(i)}-U_{(i-1)}$,
$i = 1,\ldots,n+1$, with $U_{(0)} = 0$ and $U_{(n+1)} = 1$. In the
context of testing $H_0$, the statistic derived from the $X_i$ would be
$G(n-1)$, since $n$ values of $X_i$ produce $n-1$ ordered uniforms.

The null distribution of $G(n)$ was investigated by Moran (1947)
and recently there has been a revival of interest; exact or approximate
percentage points have been given by Burrows (1979), Hill (1979; see
corrigendum 1981), Currie (1981a) and Stephens (1981). Note that when
$n$ uniforms are used, $E(D_i) = \bar{D} = 1/(n+1)$, and a natural test
statistic based on the dispersion of the $D_i$ can be defined by
$G'(n) = \sum_{i=1}^{n+1} \{D_i - 1/(n+1)\}^2$; however, since
$\sum D_i = 1$, this reduces to $G(n) - 1/(n+1)$ and so is equivalent to
$G(n)$. The application of $G(n)$ to test for exponentiality has been
studied, e.g., by Bartholomew (1957) and by Cox and Lewis (1966,
p. 163).
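As a small illustration of the definition of $G(n)$ and of the reduction $G'(n) = G(n) - 1/(n+1)$, the following sketch computes both quantities for a set of values in (0, 1). The function names and the sample values are hypothetical and chosen only for the example.

```python
def greenwood(u):
    """Greenwood's G(n): sum of the n+1 squared spacings of n values in [0, 1]."""
    pts = [0.0] + sorted(u) + [1.0]      # U_(0) = 0 and U_(n+1) = 1
    return sum((b - a) ** 2 for a, b in zip(pts, pts[1:]))

def greenwood_centred(u):
    """G'(n): squared deviations of the spacings from their mean 1/(n+1)."""
    n = len(u)
    pts = [0.0] + sorted(u) + [1.0]
    mean_spacing = 1.0 / (n + 1)
    return sum((b - a - mean_spacing) ** 2 for a, b in zip(pts, pts[1:]))

u = [0.12, 0.35, 0.41, 0.78, 0.90]       # illustrative values, n = 5
g = greenwood(u)
g_centred = greenwood_centred(u)         # equals g - 1/6 up to rounding
```

For equally spaced values the spacings are all $1/(n+1)$, so $G(n)$ takes its minimum value $1/(n+1)$ and $G'(n)$ is zero.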


2.2 Regression statistics. In 1972 Shapiro and Wilk, following a
principle earlier successfully applied to tests for normality,
introduced a test for exponentiality based on a plot of the $X_{(i)}$
against $m_i$. If the $X_i$ were from $\mathrm{Exp}(\alpha,\beta)$,
i.e. if $H_0^*$ were true, $E(X_{(i)}) = \alpha + \beta m_i$; the test
statistic is based on the ratio of the two estimates of $\beta$, that
given by Generalised Least Squares, and that given by the sample
variance. The test statistic is

$$W_E(n) = \frac{n(\bar{X}-X_{(1)})^2}{(n-1)S^2} ,$$

where $S^2 = \sum X_i^2 - n\bar{X}^2$, and $\bar{X} = \sum X_i/n$;
throughout this section all sums will run from 1 to $n$. Shapiro and
Wilk (1972) gave percentage points for $W_E(n)$, based on Monte Carlo
studies; points based on numerical integration are given by Currie
(1981b).

The statistic $W_E(n)$ was intended to test $H_0^*$, and Hahn and
Shapiro (1967, p. 298) subsequently gave a modification (called
$WE_0$) to test $H_0$, where we can assume that the regression line
passes through the origin. For ease of notation this statistic will be
called $H(n)$; it is defined by

$$H(n) = S^2/(n^2\bar{X}^2) .$$

Hahn and Shapiro provided Monte Carlo percentage points for $H(n)$.

Stephens (1978) introduced a test statistic for $H_0$, motivated
by the desire to provide a test which would not require new tables.
The statistic is

$$W_S(n) = \frac{n^2\bar{X}^2}{n\{(n+1)\sum X_i^2 - n^2\bar{X}^2\}} ,$$

and Stephens (1978) showed that $W_S(n)$ would have the same null
distribution as $W_E(n+1)$; thus the Shapiro-Wilk (1972) tables could
be used for $W_S(n)$.
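The three regression statistics of this section can be computed directly from their definitions. The sketch below is one possible transcription into Python; the function names `w_e`, `h_stat` and `w_s` and the sample values are illustrative assumptions, not notation from the cited papers.

```python
def w_e(x):
    """Shapiro-Wilk W_E(n) = n (xbar - x_(1))^2 / ((n - 1) S^2)."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum(xi * xi for xi in x) - n * xbar * xbar   # S^2 = sum X_i^2 - n xbar^2
    return n * (xbar - min(x)) ** 2 / ((n - 1) * s2)

def h_stat(x):
    """Hahn-Shapiro H(n) = S^2 / (n^2 xbar^2)."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum(xi * xi for xi in x) - n * xbar * xbar
    return s2 / (n * n * xbar * xbar)

def w_s(x):
    """Stephens W_S(n) = n^2 xbar^2 / (n {(n + 1) sum X_i^2 - n^2 xbar^2})."""
    n = len(x)
    sum_sq = sum(xi * xi for xi in x)
    total_sq = sum(x) ** 2                            # (sum X_i)^2 = n^2 xbar^2
    return total_sq / (n * ((n + 1) * sum_sq - total_sq))

x = [1.0, 2.0, 3.0, 4.0]                              # illustrative data, n = 4
values = (w_e(x), h_stat(x), w_s(x))
```

For these four values, $\bar{X} = 2.5$, $\sum X_i^2 = 30$ and $S^2 = 5$, so the three statistics are 0.6, 0.05 and 0.5 respectively.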


3.  Equivalence  of  Test  Statistics. 


The following algebraic relationships between statistics $G(n-1)$,
$H(n)$ and $W_S(n)$ are easily proved but have not been previously
noted:

$$H(n) = G(n-1) - 1/n ; \tag{3.1}$$

$$\{W_S(n)\}^{-1} = n(n+1)G(n-1) - n = n(n+1)H(n) + 1 . \tag{3.2}$$

Thus statistics $G(n-1)$, $H(n)$ and $W_S(n)$ provide equivalent tests
of $H_0$.
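The relationships (3.1) and (3.2) are stated without proof. One way to verify them, using only $D_i = X_i/T_n$ and $T_n = n\bar{X}$ from Section 1 and the definitions in Section 2, is sketched below; this is a reconstruction under those definitions, not a derivation taken from the paper.

```latex
\begin{align*}
G(n-1) &= \sum_{i=1}^{n} D_i^2
        = \frac{\sum X_i^2}{T_n^2}
        = \frac{\sum X_i^2}{n^2\bar{X}^2}, \\[4pt]
H(n)   &= \frac{S^2}{n^2\bar{X}^2}
        = \frac{\sum X_i^2 - n\bar{X}^2}{n^2\bar{X}^2}
        = G(n-1) - \frac{1}{n}, \\[4pt]
\{W_S(n)\}^{-1}
       &= \frac{n\{(n+1)\sum X_i^2 - n^2\bar{X}^2\}}{n^2\bar{X}^2}
        = n(n+1)\,G(n-1) - n, \\[4pt]
n(n+1)\,G(n-1) - n
       &= n(n+1)\Bigl\{H(n) + \frac{1}{n}\Bigr\} - n
        = n(n+1)\,H(n) + 1 .
\end{align*}
```

Since $H(n)$ is an increasing function of $G(n-1)$ and $W_S(n)$ is a decreasing function of it, a rejection region for one statistic maps onto a rejection region for each of the others.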


4.  Equivalence  of  Distributions.


Furthermore, since $W_S(n)$ has the same distribution as $W_E(n+1)$,
the distribution of $W_E(n+1)$ is related to the other statistics, and
specifically to that of $G(n-1)$. Let $G(n;\alpha)$ be the percentage
point at level $\alpha$, measured from the lower tail, for $G(n)$;
similarly define percentage points for the other statistics. Then we
have:

$$H(n;\alpha) = G(n-1;\alpha) - 1/n ; \tag{4.1}$$

$$\{W_S(n;1-\alpha)\}^{-1} = n(n+1)G(n-1;\alpha) - n ; \tag{4.2}$$

$$\{W_E(n+1;1-\alpha)\}^{-1} = n(n+1)G(n-1;\alpha) - n . \tag{4.3}$$
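Relations (4.1) to (4.3) allow a percentage point of one statistic to be converted into points of the others. The sketch below shows the arithmetic; the function names are hypothetical, and the input $G(10;0.05) = 0.1211$ is the column 1 entry of the table in this article.

```python
def h_point_from_g(g_point, n):
    """(4.1): H(n; a) obtained from G(n-1; a)."""
    return g_point - 1.0 / n

def w_point_from_g(g_point, n):
    """(4.2)/(4.3): W_S(n; 1-a) = W_E(n+1; 1-a) obtained from G(n-1; a)."""
    return 1.0 / (n * (n + 1) * g_point - n)

n = 11                                   # so that n - 1 = 10
g_05 = 0.1211                            # G(10; 0.05), column 1 of the table
h_05 = h_point_from_g(g_05, n)           # H(11; 0.05), about 0.0302
w_95 = w_point_from_g(g_05, n)           # W_S(11; 0.95) = W_E(12; 0.95), about 0.2006
```

The level changes from $\alpha$ to $1-\alpha$ in (4.2) and (4.3) because $W_S$ is a decreasing function of $G$, so the lower tail of $G$ corresponds to the upper tail of $W_S$.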

There have been numerous tables of percentage points produced for
the statistics $G(n)$, $H(n)$, and $W_E(n)$ and it is of interest to
assess the consistency of these tabulations. To this end we define

$$H^*(n) = H(n) + 1/n ,$$

$$W_E^*(n+1) = \{W_E(n+1)^{-1} + n\}/\{n(n+1)\} .$$

The various tabulations of the percentage points of $G(n)$, $H(n)$ and
$W_E(n)$ are compared in the table.
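The starred quantities place tabulated points of $H$ and $W_E$ on the scale of $G(n-1)$, which is what makes the columns of the table directly comparable. A minimal sketch of the two transformations follows; the function names are hypothetical, and the starting value 0.1211 is the column 1 entry for $n = 10$, $\alpha = 0.05$, used here only for a round-trip check.

```python
def h_star(h_point, n):
    """H*(n) = H(n) + 1/n, which is on the scale of G(n-1)."""
    return h_point + 1.0 / n

def w_star(w_point, n):
    """W_E*(n+1) = {1/W_E(n+1) + n} / {n (n + 1)}, on the scale of G(n-1)."""
    return (1.0 / w_point + n) / (n * (n + 1))

n = 11
g_point = 0.1211                                   # point of G(10), from the table
h_point = g_point - 1.0 / n                        # corresponding point of H(11)
w_point = 1.0 / (n * (n + 1) * g_point - n)        # corresponding point of W_E(12)

g_from_h = h_star(h_point, n)                      # recovers g_point
g_from_w = w_star(w_point, n)                      # recovers g_point
```

Both transformed values return the original point of $G(10)$ up to rounding, which is the consistency that the table examines across published tabulations.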


The figures in column 1 are taken from the exact values for
$G(n;\alpha)$ obtained by Burrows (1979) and Currie (1981a); in column 2
the tabulation for $G(n;\alpha)$ of Stephens (1981) using Pearson curves
is used; column 3 uses the exact values for $W_E(n;\alpha)$ given by
Currie (1981b); column 4 is based on the original Monte Carlo values
of Shapiro and Wilk (1972) for $W_E(n;\alpha)$; and column 5 is taken
from the simulation study of Hahn and Shapiro (1967, p. 334).

TABLE

Comparison of Various Tabulations

 n    α      G1(n;α)   G2(n;α)   W*1(n+2;α)   W*2(n+2;α)   H*(n+1;α)

 5    0.05   0.1994    0.2026    0.1994       0.2001       -
      0.95   0.4320    0.4330    0.4322       0.4368       -

10    0.05   0.1211    0.1222    0.1211       0.1209       0.116
      0.95   0.2404    0.2412    -            0.2367       0.257

15    0.05   0.0882    0.0887    0.0882       0.0881       0.086
      0.95   -         0.1641    -            0.1633       0.176

20    0.05   0.0698    0.0700    0.0698       0.0697       0.068
      0.95   -         0.1233    -            0.1233       0.133